1. INTRODUCTION

Renewable energy is becoming increasingly significant in power generation. Fossil resources are not a
viable future choice since they are non-renewable energy sources that contribute to environmental
degradation. In 2019, 6,963 TWh of electricity was generated from renewable sources. About 60% of this
(4,207 TWh) came from renewable hydropower, with most of the rest coming from wind and solar power
(1,412 TWh and 693 TWh, respectively) [1].

Solar energy is one of the world's fastest-growing energy sources, with countries competing for
supremacy in the thriving industry. In Africa, Morocco has set one of the world's most ambitious energy
goals. The objective is for renewable energy to account for 42% of total electricity, drawing on its solar
farms, which include the world's largest concentrated solar farm [2].

Solar panels offer many benefits: they need little maintenance and can produce renewable energy at
almost no running cost. Nevertheless, they may occasionally run into one of a few solar photovoltaic (PV)
issues. PV modules may fail for a variety of reasons, including temperature cycling, humidity freeze, and
ultraviolet (UV) exposure [3].

Solar panel failures must be monitored, measured, and analyzed continuously and automatically to
better understand the complex, multivariate, and unpredictable nature of these issues. The internet of
things (IoT), an emerging technology that connects physical objects through electronic sensors and the
internet, is attracting a lot of attention. IoT technology is spreading into a wide range of new application
fields, energy being one of them. Energy management integrates IoT technologies for real-time
consumption monitoring and performance awareness. IoT devices such as energy sensors make it possible
to gather real-time data on energy usage at many levels, such as the machine, the production line, or the
facility level [4]. Deep learning is another technique that has made significant breakthroughs in a variety
of fields since its introduction, including computer vision, natural language processing (NLP), energy,
anomaly detection, failure forecasting, and many others. Combining these two breakthrough technologies,
IoT and deep learning, can provide a viable approach for preventing solar panel failures. In this paper, we
provide a thorough literature review on PV failure detection using IoT and deep learning technologies.
The paper is structured as follows. Section 2 introduces the terminology. Section 3 describes how the
literature was collected, Section 4 discusses our findings, and Section 5 concludes the paper.


2. BACKGROUNDS 
2.1. Photovoltaic (PV) 

Photovoltaics (PV) is the direct transformation of solar irradiation into electricity by solar cells,
based on the physical principle of the photoelectric effect (see Figure 1). The direct current generated
during this process is usually converted to alternating current by an inverter and then fed into the utility
grid [5]. The majority of solar cells are made of silicon semiconductors, which are similar to those used in
the production of computer chips. These semiconductors convert electromagnetic radiation (light) into
electric current: incident light particles (photons) are absorbed in the semiconductor, raising the electrons
of the semiconductor material to a higher energy level and allowing them to move through the material.
The semiconductors are designed so that the charge carriers (electrons and electron vacancies) are
separated, thanks to the adjacent differently doped layers. The generated current is collected at the metal
contacts [6]. Solar panels are relatively low maintenance. However, nothing is completely foolproof, and
problems can arise [7], [8]. Some of the most common problems that affect solar panels are delamination
and internal corrosion, electrical issues, micro-cracks, hot spots, the potential induced degradation (PID)
effect, snail trails, and inverter problems [9].


Figure 1. On-grid solar plant


2.2. Photovoltaic maintenance 

The implementation of a maintenance system can help to avoid many issues and boost productivity.
Industrial maintenance entails not only facility inspections, but also accurate data collection on the state
of infrastructure, equipment, and machinery. Many businesses rely on technology companies that
specialize in monitoring industrial processes to accomplish this. These technological tools take daily
measurements of key indicators and send out alerts when a measurement deviates from the norm. In
addition, it is important to distinguish between the three main types of maintenance: corrective,
preventive, and predictive maintenance [10]. Corrective maintenance consists in intervening on a piece of
equipment when it fails, as opposed to preventive maintenance, which consists in intervening before it
fails in order to prevent any failure. Predictive maintenance is performed based on projections derived
from the analysis and evaluation of key parameters of asset degradation. Its basic premise is that any
element will show signs of degradation, whether visible or not, that indicate its impending failure. The
key is to understand how to recognize these warning signs. Many existing devices (sensors and thermal
cameras) allow the measurement of this degradation, which can take the form of changes in temperature,
vibration, pressure, size, position, and noise, among other things. Physical, chemical, behavioral,
electrical, and other types of degradation can occur [11]. In this context, numerous prior studies have
examined photovoltaic failure categories. The bulk of current PV technology research has focused on
large-scale solar farms because of the greater funding and incentives that larger projects can offer.
However, some PV system problems are common to both large and small-scale systems. The following
subsections present the typical PV system failures that occur frequently, as described in the literature [12].


2.2.1. Ground faults (zero efficiency faults) 

In an electrical system, the most common type of fault is the ground fault. Insulation degrades when
it is exposed to excess current, extreme temperatures, aging, and in some cases abnormal voltage levels.
Degraded insulation becomes porous and is ultimately unable to protect the wires and equipment. Without
insulation, the conductor may come into contact with an external object. If a second ground fault then
occurs, a leakage current circulates through the ground between the two fault points [13].


2.2.2. Line to line faults 

To reach the required voltage and power levels, panels are connected in series to form strings, and
the strings are then connected in parallel to create an array. Unintentional connections between two
different points in a PV array are known as line-to-line (L-L) faults [14]. Damaged DC connectors, animal
chewing, and cable aging may cause L-L faults [15].
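As a rough illustration of how the series and parallel connections set the array ratings, the following sketch computes idealised array voltage, current, and power. The module ratings (40 V, 10 A) and the array dimensions are hypothetical values chosen for the example; losses and mismatch are ignored.

```python
def array_output(v_module, i_module, modules_per_string, n_strings):
    """Idealised array ratings: series adds voltage, parallel adds current."""
    v_array = v_module * modules_per_string  # modules in series within a string
    i_array = i_module * n_strings           # strings in parallel
    return v_array, i_array, v_array * i_array


# Hypothetical array: 5 parallel strings of 20 modules rated 40 V / 10 A each.
voltage, current, power = array_output(40.0, 10.0, modules_per_string=20, n_strings=5)
print(f"{voltage} V, {current} A, {power} W")
```

An unintended connection between two points of such an array bypasses part of this series-parallel structure, which is why an L-L fault changes the expected voltage and current.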


2.2.3. Inverter failures 

Solar panels provide electricity that is used to power household appliances through solar inverters,
which need minimal maintenance if set up properly. Inverters include more electrical components than
solar panels. Unlike microinverters, string solar inverters have a lifespan of around ten years. Even
though inverters are designed for a long service life, a variety of conditions may impair their function
during that time, such as heat, faulty installation, humidity, poor maintenance, edge delamination, water
penetration, and high string voltage [16]. Inverter components are very sensitive to temperature, and too
much heat may decrease electrical production. Clean dust filters and unimpeded inverter airflow are
therefore essential [17].


2.2.4. Arc faults 

In PV systems, arc faults are a frequent occurrence. The high-temperature plasma of a prolonged arc
may severely damage system components. Solar PV systems are susceptible to two kinds of arc faults:
series and parallel (including grounding arc faults). Because of the significant potential difference across
a parallel or grounding arc fault, a considerable fault current is drawn, which makes these faults easier
for conventional protection systems to detect. In contrast, because of the nature of a photovoltaic solar
cell, a series arc fault current that is lower than the usual operating current will not blow fuses or trigger
overcurrent protection mechanisms. Unlike a parallel arc fault, a series arc fault does not draw an
additional current, and its total fault current is derived from the normal load current [18].


2.2.5. Microcracks 

Microcracks in solar cells are a real issue for PV modules. They are difficult to prevent and, as of
yet, almost impossible to quantify in terms of their long-term effect on the module's efficiency. The
power of a new module may be only slightly reduced by the existence of microcracks, as long as the
various parts of the cell are still electrically connected. As the module ages and is exposed to thermal
and mechanical stress, repeated relative movement of the fractured cell parts may cause them to become
electrically separated [19].


2.2.6. Hot spots and shading 

Shading is the most common issue that affects all solar-electric systems. Because clouds and
obstacles cannot be physically moved, it is critical to identify and eliminate any sources of hot spots,
thereby reducing the negative effects of partial shading. When non-homogeneous radiation strikes the PV
surfaces, the use of photovoltaic panels with internally integrated bypass diodes prevents the PV cells
from burning [20].


2.3. Internet of things (IoT)

The Internet has expanded dramatically over the last 50 years, from a local research network with 
only a few nodes to a ubiquitous global network with over a billion users. The ability to obtain distant sensor 
data and manage the physical world from a distance is made feasible by connecting physical objects to the 
Internet. The combination of captured data with data acquired from other sources, such as data on the 
Internet, results in new synergistic services that go beyond what an isolated embedded system can deliver. 
This vision is the foundation of the IoT [21]. A smart device is just another name for an Internet-connected
embedded device [22]. The IoT is a network of interconnected computing objects or devices, digital and
mechanical, with unique IDs and the capacity to transfer data without the need for human interaction. A
single thing on the Internet can be a human with a cardiac eHealth device, an animal with a biochip
transponder, a car with integrated sensors, or, as in our case, a smart photovoltaic panel that transfers
its telemetry via the internet (see Figure 2) [23].


Figure 2. Internet of things


2.4. Machine learning (ML) 

Artificial intelligence and machine learning (ML) techniques are revolutionizing several industrial and
academic sectors such as natural language processing, computer vision, cybersecurity, speech recognition,
and autonomous driving [24]. ML is a data analysis technique that automates the construction of analytical
models. It is a branch of AI based on the idea that systems can learn from data, detect patterns, and make
decisions with a minimum of human interaction [25]. Conventional ML approaches are limited in their
ability to process natural data in raw form and require considerable expertise to construct a feature
extractor that turns raw data into a suitable representation [26]. Deep learning overcomes this challenge
by learning representations that are expressed in terms of simpler ones [23].


2.5. Deep learning (DL) 

Deep learning algorithms can be viewed as a more complex and advanced version of machine
learning algorithms. As a result of recent advancements, the field has attracted a great deal of interest,
and with good cause. Notably, deep learning can be used in both supervised and unsupervised settings
[27]. DL applications rely on an artificial neural network (ANN), which is inspired by the biological
neural network of the human brain and is much more capable of learning than traditional machine
learning models [28].


2.5.1. Artificial neural network (ANN) 

An artificial neural network is a system that consists of a large number of linked units called
neurons. Each neuron in the network can receive input signals, process them, and produce an output
signal. A neuron is composed of a set of weighted connections, an adder that combines the input data
weighted by synaptic strength, and an activation function that limits the amplitude of the neuron's
output [29]. Multilayer feedforward networks and recurrent networks are two fundamentally distinct
types of network topologies.
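The three parts of a neuron described above (weighted connections, adder, activation function) can be sketched in a few lines. This is a minimal illustration: the sigmoid activation, the bias term, and the numeric values are assumptions made for the example, not choices taken from the cited works.

```python
import math


def neuron(inputs, weights, bias):
    """Adder (weighted sum of the inputs) followed by a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # limits the output to the range (0, 1)


# Hypothetical inputs and synaptic weights.
out = neuron([0.5, -1.0, 2.0], [0.4, 0.3, 0.1], bias=0.0)
print(round(out, 4))
```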


2.5.2. Feedforward neural network (FNN) 

Feedforward networks are currently being employed with remarkable success in a number of
applications. A feedforward network consists of many neurons organized in layers. Each neuron in a layer
is connected to all the neurons in the preceding layer (see Figure 3). These connections are not all equal;
each one may differ in strength, or weight [29]. The term "single-layer" refers to a neural network with
only one layer [30]. A multilayer feedforward network consists of an input layer of source units, one or
more hidden layers, and an output layer. The hidden layers in the FNN are not directly visible from either
the network's input or its output. These hidden layers allow the neural network to extract higher-order
statistics from its input [31].


Figure 3. Feedforward neural network layers
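A forward pass through a small multilayer feedforward network can be sketched as follows. The layer sizes (2 inputs, 3 hidden neurons, 1 output), the weights, and the sigmoid activation are arbitrary assumptions chosen only to show how each layer feeds the next.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer: every neuron sees every input of the previous layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(x, layers):
    """Propagate the input through the layers in order, with no feedback loops."""
    for weights, biases in layers:
        x = dense(x, weights, biases)
    return x


layers = [
    ([[0.5, -0.5], [0.25, 0.75], [-1.0, 1.0]], [0.0, 0.0, 0.0]),  # hidden layer: 2 -> 3
    ([[1.0, -1.0, 0.5]], [0.0]),                                   # output layer: 3 -> 1
]
y = forward([1.0, 1.0], layers)
print(y)
```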


2.5.3. Convolutional neural network (CNN) 

The convolutional neural network (CNN, or ConvNet) is a DL algorithm that can take an input image,
assign importance (learnable weights and biases) to various aspects of the image, and differentiate
between them [32]. CNNs are regularized versions of multilayer FNNs. Compared with other classification
methods, a ConvNet requires significantly less pre-processing. While basic techniques need
hand-engineered filters, ConvNets can learn these filters/characteristics with enough training. The
ConvNet design is similar to the connectivity pattern of neurons in the human brain, and it was inspired
by the organization of the visual cortex [33]. A convolutional neural network has four sorts of layers: the
convolutional layer, the pooling layer, the ReLU layer, and the fully-connected layer (see Figure 4) [34].


Figure 4. CNN layers architecture
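The first three layer types can be illustrated on a toy image without any DL framework. The sketch below is only an illustration: the 5x5 image and the hand-written edge kernel are hypothetical, whereas a real CNN would learn its kernels during training and would finish with a fully-connected layer, which is omitted here.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the operation used by convolutional layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)] for r in range(out_h)]


def relu(fmap):
    """ReLU layer: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Pooling layer: keep the strongest response in each size x size window."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]


image = [[0, 0, 1, 1, 0] for _ in range(5)]  # a bright vertical stripe
edge_kernel = [[-1, 1]]                      # responds to dark-to-bright edges
features = max_pool(relu(conv2d(image, edge_kernel)))
print(features)
```

The pooled feature map is non-zero only where the left edge of the stripe was found, which is the kind of localized feature a classifier layer would then use.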


2.5.4. Recurrent neural networks (RNN) 

The RNN is a form of artificial neural network that works on sequential data or time series data.
This DL algorithm is often used for ordinal or temporal problems such as language translation, natural
language processing (NLP), speech recognition, and image captioning [35], and it can also be used for
other applications. Like feedforward networks and CNNs, recurrent neural networks learn from training
data. They are characterized by their "memory", which allows information from previous inputs to
influence the current input and output (see Figure 5) [36]. RNNs typically experience two issues during
training: exploding gradients and vanishing gradients [37]. To address these problems, the most
well-known RNN variants, the long short-term memory (LSTM) and the gated recurrent unit (GRU), are
used.
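The "memory" of a recurrent network comes from feeding the previous hidden state back into the next update. The following scalar sketch assumes a simple tanh recurrence with hypothetical weights; it is meant only to show that an input seen at the first time step still affects the state several steps later.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """Simple recurrent update: the new state mixes the current input and the old state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run(sequence, w_x=1.0, w_h=0.5, b=0.0):
    h = 0.0  # initial hidden state
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
    return h


h_with_pulse = run([1.0, 0.0, 0.0])  # a pulse at the first step, then silence
h_silent = run([0.0, 0.0, 0.0])      # no input at all
print(h_with_pulse, h_silent)
```

In this toy setting the trace of the pulse shrinks at every step, which gives an intuition for why plain RNNs struggle to carry information over long sequences.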


2.5.5. Long short-term memory (LSTM) 

Long short-term memory (LSTM) consists of a series of recurrently connected subnetworks known as
memory blocks. Each block includes one or more self-connected memory cells, which retain past data,
and three components known as gates: the input gate, the forget gate, and the output gate, which are
continuous equivalents of write, reset, and read operations on the cells (see Figure 6) [38]. The principal
difference from a simple RNN is that the nonlinear units in the hidden layers are replaced by memory
blocks [39].
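A single step of a memory cell with its three gates can be sketched as below. This is a scalar, single-cell illustration of the commonly used LSTM update; the parameter layout (a dictionary of hypothetical `(w_x, w_h, bias)` triples) is an assumption made to keep the example short.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step; p maps a gate name to (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))    # input gate: how much new information to write
    f = sigmoid(pre("forget"))   # forget gate: how much of the old cell state to keep
    o = sigmoid(pre("output"))   # output gate: how much of the cell state to expose
    g = math.tanh(pre("cand"))   # candidate value to be written
    c = f * c_prev + i * g       # updated memory cell
    h = o * math.tanh(c)         # updated hidden state
    return h, c


# With all parameters at zero every gate is half open and the candidate is zero.
zero = {name: (0.0, 0.0, 0.0) for name in ("input", "forget", "output", "cand")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=zero)
print(h, c)
```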


2.5.6. Generative adversarial network (GAN)

The generative adversarial network is currently one of the most prominent techniques for deep
generative modeling. Generative modeling is a form of unsupervised learning in ML that automatically
discovers the regularities or patterns in input data, so that the model can generate new instances that
could plausibly have been drawn from the original dataset [40]. GANs are a clever way of training a
generative model by framing the problem as a supervised learning problem with two sub-models: the
generator model, which is trained to produce new examples, and the discriminator model, which attempts
to classify examples as either genuine (real) or fake (generated). The two models are trained concurrently
in an adversarial, zero-sum game until the discriminator model is deceived roughly half of the time [41].
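The stopping condition described above can be made concrete with the discriminator's loss. The sketch assumes the usual binary cross-entropy formulation and evaluates it for one real and one generated sample; the probability values are hypothetical.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy of the discriminator on one real and one generated sample.

    d_real and d_fake are the probabilities that the discriminator assigns to
    "this sample is real" for the real and the generated sample respectively.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


confident = discriminator_loss(d_real=0.9, d_fake=0.1)  # discriminator is winning
fooled = discriminator_loss(d_real=0.5, d_fake=0.5)     # it can only guess
print(confident, fooled)
```

When the discriminator outputs 0.5 for every sample, its loss settles at 2 ln 2, which corresponds to being deceived about half of the time.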


2.5.7. Adversarial autoencoder (AAE) 

The AAE combines the autoencoder architecture with the adversarial loss concept introduced by the
GAN. It works in a similar way to the variational autoencoder (VAE), except that the latent code is
regularized using an adversarial loss instead of the KL-divergence used by the VAE [42]. In the
variational autoencoder, a KL-divergence term is used to match the encoded latent code to a normal
distribution (or any arbitrary distribution) [43]. The AAE replaces this term with an adversarial loss by
adding a discriminator that works against the encoder. In a GAN, the generator's output is the produced
data (mostly images) and the discriminator's input is both genuine and fake data. In the AAE, by
contrast, the encoder acts as the generator: it creates a latent code and attempts to convince the
discriminator that the latent code is sampled from the selected distribution (see Figure 7) [44].


Figure 7. Adversarial autoencoder layers [44]


2.6. DL evaluation metrics 

A good evaluation metric is essential for selecting a classifier during classification training. A
proper assessment measure is therefore a crucial element in discriminating between candidates and
obtaining the best classifier [45]. Deep learning models are commonly evaluated with metrics such as
accuracy, precision, recall, F1 score, MSE, MAE, and the AUC. The classification metrics are calculated
from four counts [46]:
- True Positive (TP): the number of positive class records classified correctly.
- True Negative (TN): the number of negative class records classified correctly.
- False Positive (FP): the number of negative class records wrongly classified as positive.
- False Negative (FN): the number of positive class records wrongly classified as negative.
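The four counts can be obtained directly from the true and predicted labels. In the sketch below the label encoding (1 for a faulty panel, 0 for a healthy one) and the sample labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary classification result."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn


y_true = [1, 1, 1, 0, 0, 0, 0, 0]  # hypothetical ground truth
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]  # hypothetical model output
print(confusion_counts(y_true, y_pred))
```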


2.6.1. Accuracy
Accuracy is the percentage of correct predictions among all predictions [47], and it is calculated using (1):

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

Accuracy has many flaws, however, including a lack of uniqueness, a lack of discriminability, a lack
of informativeness, and a bias towards the majority class [45].
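The bias towards the majority class is easy to reproduce with equation (1). The following sketch uses a hypothetical plant of 1,000 panels of which only 10 are faulty, and a trivial model that labels every panel as healthy.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (1): share of correct predictions among all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical case: 990 healthy panels, 10 faulty ones, model always says "healthy".
always_healthy = accuracy(tp=0, tn=990, fp=0, fn=10)
print(always_healthy)
```

The model scores 99% accuracy although it never detects a single fault, which is why accuracy alone is a weak indicator on imbalanced data.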


2.6.2. Precision 
Precision is the proportion of predicted positive results that are actually positive [48], and it is
calculated using (2):

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{2}$$


2.6.3. Recall
Recall is the proportion of accurately identified positive results among the total number of existing
positive records [48], and it is calculated using (3):

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{3}$$


2.6.4. F1-score 

The F1-score is the harmonic mean of precision and recall. It is interesting, even more than
accuracy, because the number of true negatives (TN) is not considered [49]. A high number of true
negatives (TN) will have no effect on the F1-score. It is calculated using (4):

$$F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{4}$$
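Equations (2) to (4) translate directly into code. As a worked example, the sketch combines the precision (0.53) and recall (0.92) listed for [54] in Table 1; the resulting F1 value is not reported in the source and is derived here only for illustration.

```python
def precision(tp, fp):
    """Equation (2)."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Equation (3)."""
    return tp / (tp + fn)


def f1_score(p, r):
    """Equation (4): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


f1_from_table = f1_score(0.53, 0.92)
print(round(f1_from_table, 2))
```

The harmonic mean stays close to the weaker of the two values, so a high recall cannot hide a modest precision.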


2.6.5. Receiver operating characteristic (ROC) and area under the curve (AUC)

The ROC curve is an evaluation tool for binary classification tasks. It is a probability curve that plots
the true positive rate against the false positive rate at various threshold levels, separating the 'signal'
from the 'noise'. The area under the curve (AUC) summarizes the ROC curve and measures a classifier's
ability to differentiate between classes. The greater the AUC, the better the model is at differentiating
between the positive and negative classes [50].
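One way to compute the AUC without drawing the curve is to use its ranking interpretation: the AUC equals the probability that a randomly chosen positive record receives a higher score than a randomly chosen negative one. The sketch below uses that equivalence; the labels and scores are hypothetical.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)
```

The pairwise loop is quadratic in the number of records, so this form is meant for explanation rather than for large datasets.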


2.6.6. Mean absolute error (MAE) and root mean squared error (RMSE) 

MAE and RMSE are two of the most widely used metrics for evaluating the accuracy of predictions of
continuous variables [51]. MAE measures the average magnitude of the errors without taking into account
their direction. It is the average over the test sample of the absolute differences between prediction and
actual observation, with all individual differences weighted equally [52]. It can be calculated using (5):

$$\mathrm{MAE} = \frac{1}{n} \sum_{j=1}^{n} \left| y_j - \hat{y}_j \right| \tag{5}$$

RMSE is a quadratic scoring rule that also measures the average magnitude of the error. It is the square
root of the average of the squared differences between prediction and actual observation [52]. It can
be calculated using (6):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{j=1}^{n} \left( y_j - \hat{y}_j \right)^2} \tag{6}$$

Here, $n$ is the number of samples, $y_j$ is the observed value, and $\hat{y}_j$ is the predicted value.
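The difference between the two metrics shows up when the same total error is distributed differently. In the sketch below the observed and predicted values are hypothetical: both prediction sets have the same MAE, but the set containing one large error has a higher RMSE.

```python
import math


def mae(y_true, y_pred):
    """Equation (5): mean of the absolute errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Equation (6): square root of the mean of the squared errors."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


observed = [10.0, 10.0, 10.0, 10.0]
even_errors = [12.0, 8.0, 12.0, 8.0]      # four errors of size 2
one_big_error = [10.0, 10.0, 10.0, 18.0]  # a single error of size 8

print(mae(observed, even_errors), rmse(observed, even_errors))
print(mae(observed, one_big_error), rmse(observed, one_big_error))
```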


3. METHOD 

For our research, we applied the following combination of related keywords: ("deep learning" AND
(IoT OR "Internet of Things") AND ("PV" OR photovoltaic OR "solar panel")), which corresponds to the
purpose of this review. The search returned approximately 32 documents published from 2018 until
September 2021 (Figure 8). We then excluded some records because they were only the first few pages of
conference proceedings and not actual articles. We also excluded papers that were irrelevant to our
research area: they concentrated on forecasting solar radiation without including the maintenance
context, which has no bearing on our subject. The retained papers (see Table 1) were extracted from the
Scopus database, which is the largest abstract and indexing database of peer-reviewed literature,
containing publications, conference proceedings, patent records, and websites in the most important
subject fields [53].


Figure 8. Scopus indexed papers per year (documents by year, 2018 to 2021)


Table 1. Summary of reviewed literature

| Year | Ref. | Title | Deep learning model | Type of maintenance | Anomaly/goal | Context/dataset | Best performance results |
|---|---|---|---|---|---|---|---|
| 2021 | [54] | Digital twins in solar farms: An approach through time series and deep learning | DT: CNN and LSTM | Preventive | General anomalies | Own collected data: 22427 samples | Precision: 0.53; Recall: 0.92; AUC: 0.97 |
| 2021 | [55] | Deep Learning Enhanced Solar Energy Forecasting with AI-Driven IoT | CNN and LSTM | Predictive | Power prediction | Own collected data | RMSE (STP): 1.30 |
| 2021 | [56] | Deep Learning at the Edge for Operation and Maintenance of Large-Scale Solar Farms | ANN | Preventive | Shading | Own collected data | RMSE < 0.05 |
| 2020 | [57] | Using Siamese networks to detect shading on the edge of solar farms | ANN-Siamese Neural Network | Preventive | Shading | Own collected data: 600 samples | F1 Score: 0.94 |
| 2020 | [58] | Very Short-Term Solar Irradiance Forecasting at a Sub-Minute Scale Based on WT-CNNs | WT-CNN | Predictive | Power prediction | Own collected data | MAE: 1.63; RMSE: 2 |
| 2020 | [59] | IoT based solar energy prophecy using RNN architecture | CNN-LSTM | Predictive | Power prediction | Own collected data | MAE: 0.2; MSE: 0.1 |
| 2020 | [60] | A new architecture based on IoT and machine learning paradigms in photovoltaic systems to nowcast output energy | CNN-LSTM | Predictive | Power prediction | Opera digital systems dataset [61] | MAE: 274.87; RMSE: 531.08 |
| 2020 | [62] | Integrating IoT devices and deep learning for renewable energy in big data system | LSTM | Predictive | Power prediction | Own collected data | RMSE: 85.49 |
| 2020 | [63] | Power Prediction via Module Temperature for Solar Modules Under Soiling Conditions | MLP | Predictive | Power prediction | Own collected data: 800 samples | MAE: 0.08; RMSE: 0.10 |
| 2020 | [64] | Deep Convolutional Neural Network for Automatic Detection of Damaged Photovoltaic Cells | CNN | Corrective | Physical crack | Own collected data: 3336 samples | Recall: 0.74; Precision: 0.70; F1 Score: 0.69 |
| 2019 | [65] | DA-DCGAN: An Effective Methodology for DC Series Arc Fault Diagnosis in Photovoltaic Systems | DA-DCGAN | Preventive | Arc faults | Own collected data: 40 000 samples | Accuracy: 98.5% |
| 2019 | [66] | CNN based automatic detection of photovoltaic cell defects in electroluminescence images | CNN | Preventive | Cracks and microcracks | elpv dataset [67] | Accuracy: 93.02% |


4. FINDINGS

The research findings summarized in the preceding section (Table 1) are examined in this section.
First, we present the comparison criteria used:
- Deep learning model: the deep learning models utilized in the reviewed papers.
- Type of maintenance: corrective, preventive, or predictive maintenance.
- Anomaly/goal: the type of fault detected or the main purpose of the model.
- Context/dataset: the data used to train and test the proposed deep learning model.
- Best performance results: the best results reported for the proposed model, using metrics such as
accuracy, recall, precision, F1 score, MAE, MSE, AUC, or others.

Regarding the data used in each paper, both big and small datasets are used to train the DL models;
some include thousands of entries, while others contain far fewer. These entries may be real or synthetic.
A number of datasets were created by the researchers for their own study purposes [54]-[59], [62]-[65],
while some were taken from well-known, publicly available datasets [60], [66], such as the
"elpv-dataset". In general, the more complex the problem is, the more data is needed to solve it. For
example, training models for tasks such as class identification when there are many classes and/or little
variation among the classes requires a large amount of input data. Too little training data, as is well
known, leads to poor approximations. An over-constrained model will underfit a small training dataset,
whereas an under-constrained model is likely to overfit it. Likewise, using insufficient test data results
in an overly optimistic, high-variance estimate of model performance.

From a technical perspective, almost all of the research papers used the widely used CNN or LSTM
algorithms [55], [62], [64], [66]. Some developed their own variants of the CNN or LSTM models [54],
[55], [58]-[60], others worked with the traditional ANN and MLP [56], [57], [63], and one paper worked
with a GAN variant named "DA-DCGAN" [65]. The number of classes ranged from 2 (binary anomaly
detection) [56], [57], [64]-[66] to 3 (multiclass classification) [54], while other authors used regression
methods to predict the output power of the solar plant [55], [58]-[60], [62], [63]. In the classification
studies, the number of model outputs matched the number of classes: for each of the possible classes of
input data, the model produced a probability value, and the class with the highest probability was
selected as the prediction.
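Selecting the class with the highest probability can be sketched as follows. The use of a softmax to turn raw model outputs into probabilities is an assumption, since the reviewed papers are not quoted on this point, and the three class names are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw model outputs into probabilities that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits, classes):
    """Return the class with the highest probability, together with that probability."""
    probs = softmax(logits)
    best = max(range(len(classes)), key=probs.__getitem__)
    return classes[best], probs[best]


# One output per class, as in the reviewed classification studies.
label, prob = predict([0.2, 2.1, -0.5], ["normal", "shading", "arc fault"])
print(label, round(prob, 3))
```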

In accordance with the main objective of these research papers (the maintenance of photovoltaic
solar panels), some attempt to build a model able to detect any fault occurring while the solar plant is
running in order to prevent a breakdown. The first thing that stands out is that the majority of the papers
deal with power prediction [55], [58]-[60], [62], [63]. The connection between predicting the system
output and maintaining the system is not apparent at first glance, but PV maintenance can be effectively
aided by forecasting power generation: the forecast serves as a reference for alert thresholds and, more
importantly, supports the stability of the electrical network when the plant is on-grid. Shading is also
considered a major degradation factor in the solar PV industry. It causes the module's temperature to
rise, resulting in a reduction in power output. The models proposed in [56], [57] show good results, with
an RMSE lower than 0.05 for [56] and an F1 score of 0.94 for [57] (Table 1). The remaining papers
specialize in physical anomalies such as cracks and microcracks. With the help of these models, the
maintenance team could plan an intervention to correct the fault, or a modification of the operating
process, for better productivity in the future.
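The idea of using a power forecast as a reference for alert thresholds can be sketched as follows. This is only an illustration: the function name `power_alerts`, the 20% tolerance, and the sample values are hypothetical and are not taken from the reviewed papers.

```python
def power_alerts(forecast_kw, measured_kw, tolerance=0.2):
    """Return the time indices where measured power falls short of the forecast
    by more than `tolerance` (expressed as a fraction of the forecast)."""
    alerts = []
    for idx, (expected, actual) in enumerate(zip(forecast_kw, measured_kw)):
        if expected > 0 and (expected - actual) / expected > tolerance:
            alerts.append(idx)
    return alerts


forecast = [50.0, 80.0, 100.0, 90.0]  # hypothetical model forecast
measured = [49.0, 78.0, 70.0, 88.0]   # hypothetical plant telemetry
print(power_alerts(forecast, measured))
```

In practice the tolerance would have to be tuned against the forecasting error of the model, otherwise normal forecast noise would raise false alarms.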

The authors use a variety of metrics to evaluate the performance of their DL models, and each one
is tailored to the model used in that particular study. For each article, we provide the best resulting
metric in Table 1. The most often used metric was RMSE [55], [56], [58]-[60], [62], [63], followed by MAE
[58]-[60], [63]. Both of these metrics express the average model prediction error in units of the variable
of interest. However, since the errors are squared before they are averaged, the RMSE gives a relatively
high weight to large errors, which makes it more useful when large errors are especially undesirable.
Other studies used precision and recall [54], [64]. Precision measures how accurate the model is when it
predicts a positive outcome, so it is a useful metric when the cost of a false positive is significant. Recall
measures how many of the actual positives the model captures by classifying them as positive (true
positives), so it is a useful metric for selecting the best model when a significant cost is tied to false
negatives. The papers [57], [64] used the F1 score, which balances these last two metrics. The papers
[65], [66] evaluated their DL models using accuracy, which is the most widely used classification metric
because it is simple to use and understand. Accuracy, however, gives full credit to the many true
negatives, which contribute very little in practice, whereas false negatives and false positives usually
incur the costs. The F1 score may therefore be a better indicator when we want to strike a balance
between precision and recall and the class distribution is uneven. In one paper [54], the authors
evaluated their model using the AUC metric in addition to precision and recall. Most of the reviewed
papers used this type of evaluation (a mix of measures) to evaluate their models. Sometimes one metric
has to be traded off against another, as shown in [64]: the model performs well in terms of recall (0.90),
but its precision is lower (0.65).

Comparing the papers is difficult, if not impossible, since different metrics are used for different
tasks, with different models, datasets, and parameters. As a result, the reader should proceed with care
when considering our opinions in this area. Another disadvantage of these models is the number of faults
that they can detect. The proposed methods perform relatively well when dealing with a single fault, and
this limitation may discourage their adoption in the maintenance industry.


5. CONCLUSION 

Ensuring good performance over long periods of time is only possible by monitoring and
maintaining a PV power plant. Deep learning approaches have been used to estimate the degradation of
PV cells. The goal of this research was to survey the trends in PV system maintenance based on deep
learning and IoT during the last three years and to look for ways to combine the two for fault detection
and diagnostics in PV facilities in remote areas. According to our analysis, almost all of the studies used
the well-known CNN or LSTM algorithms. As a preventive measure, some researchers developed models
that can detect faults that occur while the solar plant is operating, and most of them specialized in
physical anomalies. Even though these proposed solutions perform relatively well when dealing with a
single fault, their performance may not be enough to entice the maintenance sector to use them. Further
research should therefore focus on dealing with multiple faults at the same time using the same model.
This is the direction of our future work.